Existential Risk and AI Governance


...a fascinating and potentially terrifying discussion 

~ John Danaher 

’Tis not contrary to reason to prefer the destruction of the whole world to the scratching of my finger.¹
~ David Hume 

...some little idiot is bound to press the ignite button²

~ Nick Bostrom 

There are multiple ways to become extinct. To evolve very fast is one not often considered. 


~ Bianco Luno 


Back when our technologies only enabled us to kill ourselves along with a subset of humanity 
this may not have been a live topic. Apparently, it is now... What would Second Amendment 
advocates think of legalizing the private possession of nuclear, biological, and chemical 
weapons?... 


1. A Treatise of Human Nature, 1739, [1896 ed.], Book II, Part III, Sect. III, p. 416.
2. Superintelligence: Paths, Dangers, Strategies, Oxford, UK: Oxford University Press, 2014. 


But, actually, it doesn’t matter whether we legalize such weapons because, legal or not, 
technology is well on the way to making their private possession inevitable. And with that, their 
use. A historian will tell you that whatever human beings have been able to do, they do. It is just 
a matter of time before the deed, too, is history. And any believer that nature is uniform in space
and time, that past events have a bearing on future ones³ (an assumption enabling scientists to
do what they do), must expect we will do what we are able to.⁴ Forget ethics, morality; forget
“random acts of kindness,” and the hold on or appeal these have for most of us. It only takes one
random act of badness to finish us off, to preclude any further acts of any sort by the likes of us.
This vulnerability has always been true for each of us taken as individuals. Suicide is nothing 
new. What is new is that soon one of us can do us all in. 


Imagine a genetically engineered disease as contagious as measles, as deadly as rabies, and with
the incubation period of kuru: it would be easy to contract, it would be years before you knew
you had it, and it would have sufficient time to infect everyone on earth but the most reclusive,
and those, too, likely, in due course...


We have no track record of restraint. The likelihood that such weapons, in the context of almost
eight billion people, will be used by one individual or a small group of non-state actors with
omnicidal intentions is very real. It is a concern of a group of philosophers who are asking what
may end our species, either as we know it, or without possibility of our being able to identify
with any successor species. (There are multiple ways to become extinct. To evolve very fast is
one not often considered.) They are also asking what, if anything, is open to us to do about it.


3. Hume thought the “principle of the uniformity of nature” was irrational. But “common sense” and anyone doing
science doesn’t want to hear this.

4. What is the evidence for the hope that we will someday develop such restraint? What character or shape may this
evidence take? Consider Steven Pinker’s thesis that we are evolving into a gentler species. This idea has two central
problems. First, Pinker shows more confidence in Enlightenment rationality than the Enlightenment thinkers he
wants to celebrate themselves ever did. They were “enlightened” not because they “knew” better than their
endarkened forebears but precisely because they were willing to subject prior dogmatisms to rational scrutiny. But
reason, as Hume famously implied, has no content or agenda of its own. It is a mere tool for facilitating the expression
of passions, and these are not all gentle. Second, Pinker seems to conceive of moral goodness as static: it is a target,
he thinks, we are closer to now than we used to be. By a less “violent” human character, he measures only physical
transgression, as though the direction of moral development were not advancing toward less tangible forms of
transgression with social, psychological, and even aesthetic characters. But if there is something to the idea that we
have made moral progress, it is evident in the observation that these sophisticated “violences” are proliferating. The
moral horizon is moving away from us at a faster rate than the moral “progress” Pinker, in particular, envisions.
Take the expansion of “rights” and “interests” talk to categories unthinkable only decades ago, for example. Peter
Singer describes with approval the notion of an ever-widening circle of moral regard, for another. He cites in the
opening passages of his seminal 1975 book Animal Liberation the instance of Thomas Taylor, a late 18th-century
Cambridge Platonist who satirically reacted to stirrings of modern feminism (he had Mary Wollstonecraft in mind)
with the parody, A Vindication of the Rights of Brutes. If we give women rights, soon we’ll be granting rights to
animals, and that, thought Taylor, was preposterous! But today, environmental ethicists are nominating inanimate
landscapes and ecosystems as proper rights-holding candidates. More recently still, Thomas Metzinger considers
the rights of artificial intelligences... The end game is that with the expansion of moral regard come ever-increasing
opportunities for evil. And the expansion is, itself, a moral imperative. With those opportunities come
actual instances of evil. Again, we have no history of not availing ourselves of these new opportunities.


The problem is that a self-extinction event need happen only once. It is a singular event. Concern 
about it does not require probability (though there is the non-negligibility of that to consider). It 
is enough that it is possible—when once it was not—to inspire serious concern. 
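
The parenthetical point about non-negligibility can be given a rough arithmetic shape. What follows is only a sketch under assumptions that are mine and not Torres's or Danaher's: that each of n people independently has the same tiny chance p of both attempting and succeeding at an omnicidal act within some period. Both the independence assumption and every value of p tried below are hypothetical placeholders, not estimates. On those assumptions the chance that at least one person succeeds is 1 - (1 - p)^n.

```python
import math


def prob_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent actors succeeds.

    p is the (assumed, hypothetical) per-person chance of attempting
    and succeeding; n is the number of people. Computes 1 - (1 - p)**n
    using log1p/expm1 so that very small p does not round away.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    if p == 1.0:
        # log1p(-1.0) is undefined, so handle certainty separately.
        return 1.0 if n > 0 else 0.0
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    population = 8_000_000_000  # "almost eight billion people"
    # Purely illustrative per-person chances; none is an empirical estimate.
    for p in (1e-12, 1e-10, 1e-9):
        print(f"p = {p:.0e}: {prob_at_least_one(p, population):.4f}")
```

With a per-person chance of one in ten billion, the result across eight billion people is roughly 0.55. The point is not the particular number but how quickly a negligible individual probability stops being negligible once it is multiplied across everyone.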


Phil Torres, a researcher at the Centre for the Study of Existential Risk at Cambridge University, 
pursues these thoughts to the interesting conclusion that we must give up on human governance 
altogether and replace it with a “benign AI god” able to surveil and regulate us in a way we
don’t have it in us to do ourselves. 


If you thought self-driving cars were an advanced application of AI... 


Torres extensively worries the topic with John Danaher at philosophicaldisquisitions.com. 


[This is the fourth in a series of topics on artificial intelligence and its philosophical
implications: Part 1 - consciousness and suffering, Part 2 - robot love, and Part 3 - self-driving
cars.]


Resources 
1. “Torres on Existential Risk, Omnicidal Agents and Superintelligence,” John Danaher 
interviews Phil Torres at Philosophical Disquisitions. This podcast is essential listening 
for this topic. The discussion is exhaustive and Danaher provides a list of more resources. 


2. “Superintelligence and the Future of Governance: On Prioritizing the Control Problem at 
the End of History,” Phil Torres, forthcoming in Artificial Intelligence Safety and
Security (ed. Roman Yampolskiy).


3. “Nick Bostrom, Technology That Could End Humanity—and How to Stop It.” Wired 
interview with one of the pioneers of the existential risk topic. 


4. “Surveillance of digital life and the use of sousveillance as a response,” Sam Shepherd, 
Medium, 2015. 


5. “How do you teach a machine right from wrong? Addressing the morality within 
Artificial Intelligence: Reality is catching up to fiction,” Joseph Brean, National Post. 


6. “Slaughterbots,” a video short dramatizing how drones might be used to enforce a 
programmed mandate to keep us all in line. Who would decide what “in line” means? 
And who would do the enforcing? Humans, who can’t be trusted? Or a “benign AI god,” 
who might rob us of a reason to exist? (Steven Pinker, where are our “better angels”?) 


7. “The Great Filter.” Physicist Enrico Fermi once asked “where are they?” The
extraterrestrial intelligences that the Drake Equation later predicted there should be, that
small but reasonable number, around 20, on conservative assumptions? How come we 
haven’t encountered even one? The Great Filter theory attempts an explanation. 


8. “The end of humanity: Nick Bostrom at TEDxOxford.” Description: “Swedish 
philosopher Nick Bostrom began thinking of a future full of human enhancement,
nanotechnology and cloning long before they became mainstream concerns. Bostrom
approaches both the inevitable and the speculative using the tools of philosophy, 
bioethics and probability.” 


9. “Existential Risk & Human Extinction: An Intellectual History,” Thomas Moynihan. 
Description: “Of late, ‘existential risks’ have become the target of rigorous study. We are
becoming ever more conversant with risks of increasing scope and severity. Yet, this
dynamic, of our growing concern for our own extinction, itself has a history. 
Accordingly, this talk attempts to supply a history to the first discovery of the idea of 
human extinction. We first became gripped by such prospects during the Age of 
Enlightenment. The prime reason why anticipations of our extinction were absent prior to 
this era was the age-old conviction that it is the nature of the universe to be as full of 
value as is possible. Thus, should we die out today, humans (or axiologically equitable 
beings) would simply reemerge tomorrow. Extinction could have no stakes. This talk 
explores the unbinding of such cosmic nonchalance, and our realisation of the 
precariousness and preciousness of terrestrial intelligence, tracing it through the 
philosophical and scientific breakthroughs of the Age of Reason. This talk was given by 
Thomas Moynihan on 30th April, 2019.” The history, it seems, of our last days... 


Existential risk 

To avoid misunderstandings about the topic, existential risk, I stress that it isn’t particularly
about finding fault with technology, or people, or even the people who demand and use
technology (which is just about all of us). It’s about something much bigger...


[Image: “Box cutter,” 2002, Tobias Wong: basic technology⁵]


It’s not a debate about whether technology kills people or whether people kill people. Both are
true, but that’s not the issue. Both have been happening since the beginning of the species. We
can foresee, however, that something is different now.


5. A bit of basic technology in the causal history of a 5.6 trillion dollar war and hundreds of thousands of deaths.
Tobias Wong’s silver-plated box cutter.


The real issue is: does evolution⁶ kill people? Is it inevitable that one of these two things will
happen in an already foreseeable future?


1. We become extinct as a species, or 
2. We transform into something that we won’t or can’t identify with. 


Serious thinkers are suggesting this is the dilemma we are facing. If the former, the game is over;
if the latter, the game changes into one we may have no interest in playing.


If you think this is a false dilemma, we will lay out the premises of an argument with that 
dilemma as a conclusion. We can then take a hard look at whether we believe each premise is
true, or whether, even if the premises are true, we must accept the dilemmic conclusion. A 
premise may be false or the logic faulty. 


Finally, even if the premises are true, and the conclusion follows, what attitude should we take to 
it? 


The problem 


Torres offers these as premises: 


(i) The Threat of Universal Unilateralism: Emerging technologies are enabling a rapidly growing 
number of nonstate actors to unilaterally inflict unprecedented harm on the global village; this 
trend of mass empowerment is significantly increasing the probability of an existential 
catastrophe—and could even constitute a Great Filter (Sotos 2017). 


(ii) The Preemption Principle: If we wish to obviate an existential catastrophe, then societies will 
need a way to preemptively avert not just most but all possible attacks with existential 
consequences, since the consequences of an existential catastrophe are by definition irreversible. 


(iii) The Need for a Singleton: The most effective way to preemptively avert attacks is through 
some regime of mass surveillance that enables governing bodies to monitor the actions, and 
perhaps even the brain states, of citizens; ultimately, this will require the formation of a 
singleton. 


6. “Evolution” here means the natural processes involved in explaining our species’ existence at this time and place. 
Non-natural processes are ruled out. Prior events interacting with current events explain all future events. There is 
no room for an exception on a naturalistic understanding of the world. This sense of evolution entails that not only 
our biologies are determined (causally explained) by natural forces and laws, but our psychologies, sociologies, 
cultures, and values are as well. (For something to escape, we would have to appeal to extra-natural causes.) And we 
have only the history of these natural events to construct probable future ones. 


(iv) The Threat of State Dissolution: The trend of (i) will severely undercut the capacity of 
governing bodies to effectively monitor their citizens, because the capacity of states to provide 
security depends upon a sufficiently large “power differential” between themselves and their 
citizens. 


(v) The Limits of Security: If states are unable to effectively monitor their citizens, they will be 
unable to neutralize the threat posed by (i), thus resulting in a high probability of an existential 
catastrophe.⁷


Possible solutions/non-solutions
“The Great Filter” 


The universe has nothing to do with us or anything about us. Nature seems to have rules that
preclude life... or, at the very least, life anywhere at, or in excess of, our sophistication. Our
existence is a freak of nature. We are an error, a mistake. Nature will correct this. It will erase
us. It always does. It does this to all developments of life, usually at much earlier stages. We are
doomed...


[Figure: “Capacity to Destroy Civilization” over time; labels include “State actors.”]


Or...


Nature already has tried to eliminate us and, against all odds, we are still here. We somehow 
passed through the filter and made it out on the other side. We were not supposed to happen. But 
we did. We have a chance now to conquer the universe! We have what it takes! 


But the downside of our luck is that we are likely alone. Either nowhere else in the universe has 
anything as freakish as us ever come about, or the event is so distant in time and space that we 
shall never encounter others like us. This explains Fermi’s Paradox. Enrico Fermi, at Los Alamos
in 1950, asked “where are they?” It seemed, given the vastness of the universe, there
must be others.


But remember, we broke natural norms. We bent laws of nature. We are freaks. Freaks all alone. 
The logic behind this is called the Great Filter. 


The problem is that we don’t know if we have already passed through the Great Filter or if it is 
still lying in wait for us. 


We are hurtling rapidly toward finding out. Indications are, so existential risk researchers think, 
we will learn sooner rather than later.


7. “Superintelligence and the Future of Governance: On Prioritizing the Control Problem at the End of History,” Phil 
Torres, forthcoming in Artificial Intelligence Safety and Security (ed. Roman Yampolskiy), p. 2.


On identity and discernibility — why this is important 

Because when we talk about things happening or not happening to us, whether as an individual 
or species, whether in the past or in the future, we need to be clear what it means to declare a 
thing at one time to be the same thing as at another time. In what sense is it true that I am the
same person today as I was yesterday or will be tomorrow? Clarifying identity is an ancient 
philosophical problem. Perduring, that is, continuing to exist through one period of time and into 
another, is not as simple as you may think. There are different kinds of identity over time. Some 
may matter to us more than others. Which and why? 


Logical identity 

In the context of species survival, what kind of thing do we imagine ourselves to be as a species? 
A thing that will last forever? Or, a finite thing with an end as it is with each individual among 
us? 


And quite apart from that existential question (or conceit) is the normative question: given that it 
may be logically—whether physically or not—possible for us to last as long as the universe that 
contains us, ought we to last that long? If it is possible, should we try? 


Or is it not even conceptually possible? If it is not conceptually possible, then the moral question 
is moot, it disappears. It disappears because we have no idea what sense to give the notion that 
we might exist indefinitely. 


The logical relations involved here are identity and similarity. Identity is all-or-nothing. 
Similarity admits of degrees. It is important to specify which of these two concepts is relevant 
because the logic of each is quite distinct. 


Am I the same (as in logically identical) person now as I was when I was ten years old? No,
already to say I “was” implies I am no longer, which entails I am, at least in that regard,
different. I cannot be both logically identical with and distinct from something. I am not identical
with the ten-year-old because we are discernible. Discernibility is for an assertion true of one
thing not to be true of another. It is true of the person then that he is ten. It is not true of the
now person, “me,” that I am ten.
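
The reasoning in this paragraph is an application of what is usually called the indiscernibility of identicals (Leibniz's Law). As a sketch only, with a and b as my placeholder names for the two candidates and F ranging over whatever may be truly asserted of a thing:

```latex
a = b \;\rightarrow\; \forall F\,(Fa \leftrightarrow Fb)
\qquad\text{and so, by contraposition,}\qquad
\exists F\,(Fa \wedge \neg Fb) \;\rightarrow\; a \neq b
```

Taking F as “is ten years old,” a as the child, and b as the present me gives the conclusion above: one discerning assertion is enough to rule out logical identity. How tensed assertions of this kind ought to be handled is itself disputed, so this is an illustration of the argument's form and not a settlement of it.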


That leaves the relation of similarity. Certainly, there are resemblances between me and the
ten-year-old. Sometimes resemblance or similarity is adequate for alternative, non-logical, forms of
identity, such as psychological, socio/cultural/legal, moral, etc. Sometimes not. Exactly when 
similarity is adequate for these senses of identity and when not is unclear. They are intrinsically 
ambiguous in a way pure logical identity is not. In fact, the ambiguity is itself a part of the 
identity of these concepts. Not so with logical⁸ identity. But maybe we don’t care about logical
identity. Maybe we can set it aside as unimportant... 


8. Aka, “numerical” identity. See SEP on Identity. 


Psychological identity 

Ordinarily, I take myself to be psychologically identical with the person I was yesterday if the
person I am today can remember himself in existence yesterday. Still, but probably a little less 
so, this is true of the “me” one year ago. Even less a decade or more ago. When I was ten? I 
want to say, at least, a little, but honestly, it is far less clear what this means even to myself. My 
“self” is progressively less connected with more distant and less vivid events—both those past 
(in memory) and those yet to occur (in imagination/anticipation). 


Socio/cultural/legal identity 


Family, of course, those who knew me then and know me now, would say yes, I am identical
now with the person we agree is me in a childhood photo. I have the same citizenship status as 
that kid had. The same legal name. I have the same home town, birthday, etc. 


But even if any or all of these things were different or uncertain it would not necessarily affect 
my psychological identity.⁹ There may be contingent connections between psychological identity
and these more “other-involved” identities but not necessary ones. 


Moral identity 

The least discussed kind of identity, but the most relevant one for addressing questions of
whether we could fathom what it would mean to perdure indefinitely, is moral identity, in
which rational agency is centrally implicated. Moral (or agential) identity is tied to a
sense of responsibility or mission or duty and whatever it takes as guiding principles. Whether I 
identify with the author of acts or thoughts attributed to me is relevant to whether I have an 
identity, a self, or any kind of presence in play. I do not have this “presence” regarding events 
that occurred before my earliest memories or beyond anticipations I venture to have about the 
future. About World War II or the Jurassic Period, I can only draw on information or experiences 
to which others have exposed me or inferences drawn from these. About the world one hundred 
years hence or post our extinction, I can only share or participate in speculation. 


I do not have a presence in either that past or that future in which nothing I could think or do
could possibly have effect. Or in either that past or that future in which nothing exists with
which I could empathetically identify. There are limits to empathy. The limits might be
transcended, but the more transcended the less I can have a present empathy for. And if we have
in mind the future empathy of a future entity, then it appears that would presuppose the
possibility of an empathic identity of some kind. What would anchor, or link, the empathy
between me now and whatever, it is suggested, is “me” then? Hence, to the extent we can
empathize, we have not transcended. If the purported future “me” is too different from the me I
am intimate with now, what can cause me now to identify with me then?


9. It happens, in my particular case, that much of this early information is filled with uncertainty. I have almost no
knowledge of my biological father. I grew up knowing as parents only my mother and step-father. My name was
legally altered when I was eight. A large dimension of my origin is to this day unknown or unclear to me. Am I
different from the norm because of this? Psychologically so? In some other significant sense, different?


Suppose this future “me” is, by any measure, a “better” me, a new and improved me. My “better
angel,” as Steven Pinker is inclined to put it. How much “betterness” can I tolerate before I
must say that, although the world might be a better place with this future me, I would not be
present in that world? That me is not me. My identity has dropped from consideration.


How different is “too” different? You would have to ask me! Look me up. I am in the world in 
which there is something I identify with. You, the universe, whatever... may stipulate that a me 
or something or someone significantly like me exists in another possible world. Some purpose 
may be served by that. Maybe a future “me” would be flattered to be a future instance of me 
now, or find succor in tracing a genealogy to me or something in the history of my doings, but 
this is not a case of moral or agential identity. Do or think as you please, but there is not an 
ounce of reason for me to take such identity seriously. I can channel Napoleon. I don’t think 
Napoleon ever cared to imagine this, nor do I think he would have reason to. I think he could
have discerned, even then, the difference between anything about him and whatever got into my 
head. 


More generally, “post-human” means just that: post-human, an existential task not in our
job-description, as it were. Should we worry about “post-humans” or truly “trans-humans”? Maybe.
But surely not because “post” or “trans” are suffixed with “-humans.” Something more 
interesting needs to be said. 


We can say that imagination outstrips the socio/cultural/legal, psychological, and moral forms of 
identity. We can imagine states, events, or entities we cannot identify with. Push the boundaries 
of similarity that ground these (non-logical) forms of identity and the meaningfulness of identity 
drops out. (The barest nudge in the direction of difference is sufficient to make logical identity
irrelevant.) 


We can imagine lots of things that have nothing to do with us. If we bother despite this, it is
quintessentially idle curiosity: what it would mean for a community of artificial intelligences to
supersede us, for example. Outside of a pastime, it would mean nothing to us because we would
not have a presence in such a community. “Meaning” there and then would be about them, not
us.¹¹


10. Something like this must base the fascination of family genealogical research. 

11. I am assuming common extant understandings of the parameters of empathy. To encompass the possibility of 
empathic relations with inanimate objects (as in object sexuality or deep ecology) would obviate the boundaries of 
empathic moral regard discussed here... The inclusion of such more developed or “evolved” ethical orientations 
might paradoxically sharpen an intuition of moral finitude: i.e., that we may come to feel imperatives to forfeit 
perdurance: stop creating the conditions for suffering as suggested by some negative utilitarians (David Benatar), or
get out of the way of other things more worthy than us to exist, a view of some hyperbolic deontologists (Otto 
Weininger). 


Imagination pushed, in other words, erodes then, ultimately, destroys relevance.¹² And a failure
of imagination? Imagination comes too late for a deer in the headlights.


—Victor Munoz 


July 2019 


the Philosophy club 


12. This is related to why science fiction is both morally and aesthetically inferior to, say, literary fiction. The best 
of the latter never sacrifices authenticity for clarity. Science is fundamentally methodological. That is both virtue 
and vice. Likewise, analytic philosophy sometimes suffers the same liability. 




